winch: respect the enable_nan_canonicalization setting#12939
winch: respect the enable_nan_canonicalization setting#12939saulecabrera merged 9 commits intobytecodealliance:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR ensures Winch respects the existing enable_nan_canonicalization shared setting by adding a canonicalize_nan hook to the MacroAssembler trait and invoking it after scalar floating-point operations so NaN results are normalized to the canonical quiet-NaN bit pattern.
Changes:
- Add
MacroAssembler::canonicalize_nanand implement it for x64 and aarch64. - Invoke NaN canonicalization after scalar float arithmetic/conversion ops (and after rounding paths in the visitor).
- Add a scalar WAST test to validate canonical NaN bit patterns.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
winch/codegen/src/visitor.rs |
Inserts canonicalize_nan after scalar float ops and adds helper for rounding results. |
winch/codegen/src/masm.rs |
Extends the MacroAssembler trait with canonicalize_nan. |
winch/codegen/src/isa/x64/masm.rs |
Implements NaN detection + replacement with canonical NaN on x64. |
winch/codegen/src/isa/aarch64/masm.rs |
Implements NaN detection + replacement with canonical NaN on aarch64; stores shared flags in the masm. |
tests/misc_testsuite/canonicalize-nan-scalar.wast |
Adds scalar coverage for canonical NaN bit patterns and propagation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
fc5efb7 to
97d89bb
Compare
97d89bb to
0c5e168
Compare
Subscribe to Label ActionDetailsThis issue or pull request has been labeled: "winch"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
|
@r-near we usually either stick with the assignment bot's suggestions (the reason I reviewed your last PR) or, in cases such as this one, tag a sub-area reviewer who knows part of the code best -- in this case @saulecabrera might be best to review Winch-specific work? |
|
I'll review. |
saulecabrera
left a comment
There was a problem hiding this comment.
Looking good, some comments inline.
| Ok(()) | ||
| } | ||
|
|
||
| fn canonicalize_nan(&mut self, reg: WritableReg, size: OperandSize) -> Result<()> { |
There was a problem hiding this comment.
Given that the canonicalization here is optional, let's rename this method to maybe_canonicalize_nan.
There was a problem hiding this comment.
Also, could you expand a bit on the rationale for adding a per-Masm implementation instead of an ISA-agnostic one? A priori, it seems that the MacroAssember exposes enough functionality to express canonicalization one layer above.
There was a problem hiding this comment.
The trait doesn't have float-and-compare branch or float-constant-load, so an agnostic version would need to go through float_cmp_with_set plus an integer branch, which is a few more instructions than the native path each backend can do.
I was leaning per-ISA, but perhaps we just add those primitives to the trait?
BTW I added the shared constants so that should at least cover the duplication concern.
There was a problem hiding this comment.
Renamed the method here: 7cdd58e (this PR)
winch/codegen/src/isa/x64/masm.rs
Outdated
| OperandSize::S32 => &0x7FC00000u32.to_le_bytes(), | ||
| OperandSize::S64 => &0x7FF8000000000000u64.to_le_bytes(), |
There was a problem hiding this comment.
To avoid repeating these immediate values across ISAs, can we introduce a constant at the Masm layer and import them here instead?
There was a problem hiding this comment.
Could we also implement this functionality for the non-scalar counterparts? I believe that all vector instructions are implemented in order to respect canonicalization. Once implemented we should be able to fully test simd/canonicalize-nan.wast in x86-64, by removing this line https://github.com/bytecodealliance/wasmtime/blob/main/crates/test-util/src/wast.rs#L614
There was a problem hiding this comment.
Ok here we go: cdef987 (this PR).
Added maybe_canonicalize_v128_nan to the masm trait and implemented it for x64
I removed the skip and it looks like the tests pass 🎉
winch/codegen/src/visitor.rs
Outdated
| where | ||
| M: MacroAssembler, | ||
| { | ||
| fn canonicalize_nan_for_round(&mut self, size: OperandSize) -> Result<()> { |
There was a problem hiding this comment.
Could we replace this function with a a call to self.masm.canonicalize_nan passing in the return register of the builtin function instead? The register holding the return value should be accessible via the builtins struct. That way we can avoid a function call (to canonicalize_nan_for_roundand going through all the hoops involved in pop_to_reg) and more importantly we reduce the risk of this function getting used in context different to round-result canonicalization, for which as far as I can tell, we are not enforcing any particular invariants.
There was a problem hiding this comment.
Ended up removing the function here: 1722807 (this PR).
I kept the pop/canonicalize/push inline since pop_to_reg is essentially a no-op when the value is already in a register... idk, it looked like wiring up the return register through float_round felt more invasive, but that might be a more purist solution
saulecabrera
left a comment
There was a problem hiding this comment.
Looks good to me, thanks for iterating on this. One last request and then we can land this one.
| Ok(()) | ||
| } | ||
|
|
||
| fn maybe_canonicalize_nan(&mut self, reg: WritableReg, size: OperandSize) -> Result<()> { |
There was a problem hiding this comment.
I wonder if it would be worth expressing the scalar version as the vector counterpart (using a single lane), to make it branchless. We'd probably need to fallback to the branch version when avx instructions are not supported though. For now let's leave a comment here stating that in the future if a more performant version is needed, a variation of the v128 branchless could be considered for scalars.
|
@saulecabrera could you re-add it to the merge queue? I fixed the test failure |
|
That is the right fix, yeah. |
Behavior of Winch changed in bytecodealliance#12939 in such a way that it's breaking differential fuzz testing with wasmi. Now that Winch supports NaN canonicalization this commit adjusts the fuzz test configuration generation to additionally enable it for Winch in addition to Cranelift.
Behavior of Winch changed in #12939 in such a way that it's breaking differential fuzz testing with wasmi. Now that Winch supports NaN canonicalization this commit adjusts the fuzz test configuration generation to additionally enable it for Winch in addition to Cranelift.
The
enable_nan_canonicalizationflag already flows through to Winch via the sharedFlags, but Winch was ignoring it. This addsmaybe_canonicalize_nan(scalar) andmaybe_canonicalize_v128_nan(SIMD) methods to theMasmtrait that, when the flag is set, emit a NaN-detecting sequence to replace results with the canonical quiet NaN.Covers add, sub, mul, div, min, max, sqrt, ceil, floor, trunc, nearest, demote, and promote for both scalar and SIMD. Canonical NaN constants are shared across ISAs. Removes the
canonicalize-nan.wastWinch skip.